French Learners Audio Corpus of German Speech (FLACGS)
Abstract
The French Learners Audio Corpus of German Speech (FLACGS) was created to compare the German speech production of German native speakers (GG) and French learners of German (FG) across three tasks of increasing production complexity: repetition, reading, and picture description. Forty speakers (20 GG and 20 FG) performed each of the three tasks, yielding approximately 7 h of speech in total. The corpus was manually transcribed and automatically aligned. One type of analysis that such a corpus supports is the study of segmental differences between the speech production of L2 learners and that of native speakers. As an example, we chose the realization of the velar nasal consonant /N/. In spoken French, /N/ does not appear in a VCV context, which leads to production difficulties for FG. In the more complex production tasks (reading and picture description), FG realize /N/ as [Ng] in over 50% of cases. The results of a two-way ANOVA with unequal sample sizes on the durations of the different realizations of engma indicate that duration is a reliable factor for distinguishing between [N] and [Ng] in FG productions, compared with the [N] productions of GG in a VCV context. The FLACGS corpus thus supports the study of L2 production and perception.
Similar resources
The IFCASL Corpus of French and German Non-native and Native Read Speech
The IFCASL corpus is a French-German bilingual phonetic learner corpus designed, recorded and annotated in a project on individualized feedback in computer-assisted spoken language learning. The motivation for setting up this corpus was that there is no phonetically annotated and segmented corpus for this language pair of comparable size and coverage. In contrast to most learner corpora, the...
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
We present the design of a corpus of native and non-native speech for the language pair French-German, with a special emphasis on phonetic and prosodic aspects. To our knowledge there is no suitable corpus, in terms of size and coverage, currently available for the target language pair. To select the target L1-L2 interference phenomena we prepare a small preliminary corpus (corpus1), which is a...
Microsoft Speech Language Translation (MSLT) Corpus: The IWSLT 2016 release for English, French and German
We describe the Microsoft Speech Language Translation (MSLT) corpus, which was created in order to evaluate end-to-end conversational speech translation quality. The corpus was created from actual conversations over Skype, and we provide details on the recording setup and the different layers of associated text data. The corpus release includes Test and Dev sets with reference transcripts for sp...
Inter-annotator agreement for a speech corpus pronounced by French and German language learners
This paper presents the results of an investigation of interannotator agreement for the non-native and native French part of the IFCASL corpus. This large bilingual speech corpus for French and German language learners was manually annotated by several annotators. This manual annotation is the starting point which will be used both to improve the automatic segmentation algorithms and derive dia...
Productions of /h/ in German: French vs. German speakers
This paper investigates the production of /h/ by French learners of German in comparison to German native speakers. Anecdotally, French speakers are assumed to delete /h/ when speaking German. We investigate the extent to which learners have problems producing /h/ and if advanced learners show different production patterns than beginners. When French speakers produce /h/ our analysis focuses on...